Multi-cue recall (v3.4 - v3.6)

Summary

Memory recall is often modeled as a process of evidence accumulation. Existing models typically assume that this accumulation process is passive, and not subject to top-down control. In contrast, recent work in perceptual and value-based decision making has suggested that similar kinds of evidence accumulation processes are guided by attention, such that evidence for attended items is accumulated faster than for non-attended items. Furthermore, attention may be adaptively allocated to different items in order to optimize a tradeoff between decision quality and the computational cost of evidence accumulation. In this project, we ask whether similar forces are at play in the context of memory recall.

Such a model predicts that, when multiple memories are relevant, people will focus their efforts on recalling the target which is more strongly represented in memory, because it can be recalled with less effort. Here we present a simple form of such a model, and test this key prediction in a cued-recall experiment in which participants can select which of two possible targets to remember. We find tentative support for a model in which memory search is guided by partial recall progress in order to minimize the time spent recalling.

Model

We model memory recall as a process of evidence accumulation. As in the drift diffusion model (DDM) or the leaky competing accumulator (LCA), we assume that evidence is sampled at each time step and that recall occurs when the total evidence hits a threshold. To make the model solvable (by dynamic programming), we assume that the evidence for each target follows a Bernoulli distribution \[ x_t \sim \text{Bernoulli}(p), \] where \(p\) corresponds to the strength of the memory. This figure shows several possible traces of evidence accumulation for a single item:

knitr::include_graphics("figs/accumulation.png")
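For intuition, here is a minimal simulation of this accumulation process. The threshold and \(p\) values are illustrative, not fitted parameters:

```r
# Sketch of Bernoulli evidence accumulation to a threshold.
# threshold and p are illustrative values, not fits.
simulate_recall = function(p, threshold = 10, max_steps = 1000) {
    evidence = 0
    for (t in seq_len(max_steps)) {
        evidence = evidence + rbinom(1, 1, p)  # x_t ~ Bernoulli(p)
        if (evidence >= threshold) return(t)   # recall when evidence hits threshold
    }
    NA  # recall failure within the time limit
}

set.seed(1)
rt_strong = replicate(1000, simulate_recall(p = 0.8))
rt_weak   = replicate(1000, simulate_recall(p = 0.4))
mean(rt_strong) < mean(rt_weak)  # stronger memories are recalled faster
```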

When multiple memories are relevant, each one has a separate accumulator. Critically, we do not assume that evidence is sampled for each item in parallel. Instead, at each time step, a policy selects one of the targets and accumulates evidence for only that target. Solving the resulting Markov decision process, we find that the optimal policy generally converges on the target with maximal memory strength (highest \(p\)) and only draws samples for that target until it is recalled. This is illustrated in the following plot:

knitr::include_graphics("../model/figs/simple_fixation.png")
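The behavior that the optimal policy converges to can be sketched directly (the actual policy comes from dynamic programming; the threshold here is again illustrative): sample only the target with the highest \(p\) until it crosses threshold.

```r
# Sketch of the policy the dynamic-programming solution converges to:
# repeatedly sample the target with maximal memory strength.
# threshold is an illustrative value.
run_policy = function(p, threshold = 10) {
    evidence = rep(0, length(p))
    fixations = integer(0)
    while (max(evidence) < threshold) {
        i = which.max(p)                        # attend the stronger memory
        evidence[i] = evidence[i] + rbinom(1, 1, p[i])
        fixations = c(fixations, i)
    }
    list(recalled = which.max(evidence), fixations = fixations)
}

set.seed(1)
out = run_policy(p = c(0.3, 0.7))
out$recalled  # always the stronger (second) target
```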

Experiment

To test the model’s predictions, we developed a modified cued-recall experiment in which participants are presented with two cues (images) on each trial and can recall the target (word) associated with either one. To create an observable behavioral correlate of targeted memory search, only one cue is visible at a time, and participants use the keyboard to display each in turn. This is basically a cheap alternative to eye-tracking. The assumption is that people will look at the image whose associated word they are currently trying to remember. See a demo here.

knitr::include_graphics("figs/task.png")

The model predicts that people will spend more time looking at the cue for which the memory of the corresponding target is stronger. To test this prediction, we need a way to measure memory strength. Early attempts to manipulate memorability using established norms were not successful. Thus, we instead take the approach of measuring the strength of each cue-target pair in an earlier phase of the experiment. We’ve tried two strategies.

  1. Standard cued recall task: operationalize memory strength of each pair as the average log reaction time for trials in which the participant recalled the target (with a fixed penalty for failure trials).
  2. Reverse 2AFC task: operationalize memory strength as the average log reaction time to choose the correct image out of two options, when presented with the word. Note that we have reversed the role of cue and target here in order to reduce the likelihood that people develop metacognitive awareness of which images they know well in this pretest phase.
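As a sketch, strategy 2’s index might be computed like this. The data frame and column names (`wid`, `pair`, `rt`, `correct`) are hypothetical, not the actual analysis code, and we negate the mean log RT so that higher values mean stronger memory:

```r
library(dplyr)

# Toy pretest data; columns (wid, pair, rt, correct) are hypothetical.
pretest = tibble(
    wid = rep(c("A", "B"), each = 4),
    pair = rep(c("p1", "p1", "p2", "p2"), 2),
    rt = c(800, 900, 1500, 1600, 700, 750, 2000, 2100),
    correct = TRUE
)

strengths = pretest %>%
    filter(correct) %>%                                        # correct 2AFC trials only
    group_by(wid, pair) %>%
    summarise(strength = -mean(log(rt)), .groups = "drop") %>% # faster recognition = stronger memory
    group_by(wid) %>%
    mutate(strength = as.numeric(scale(strength))) %>%         # Z-score within participant
    ungroup()
```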

Overall, the results are stronger using the second approach, so I’ll focus on those. Results with approach 1 can be found here.

Results

Probability of remembering first word

As a sanity check, we first ask whether our memory strength index predicts participants’ choices about which image to recall. We operationalize memory strength as the within-participant Z-scored mean log 2AFC RT (that’s a mouthful!). If this index tracks memory strength, the probability that a target is chosen for recollection should depend on the difference in “strength” between the two cues.

Note: For the model, strength corresponds to the \(p\) parameter. Here, we simply Z-score \(p\) to put it in roughly the same units. The assumption that \(p\) is linearly related to log reaction time is almost certainly wrong, but it’s enough to get a sense of qualitative model predictions.

df %>% 
    ggplot(aes(rel_strength, as.numeric(choose_first))) + 
    geom_smooth(method = "glm", method.args = list(family = "binomial"), formula=y~x) +
    stat_summary_bin(fun.data=mean_cl_boot, bins=5) + 
    facet_wrap(~name) +
    labs(x="First Cue Memory Strength", y="Prob Select First Cue")

X = df %>% filter(name == "Human")
lmer(choose_first ~ rel_strength + (rel_strength|wid), data=X) %>% summ
Fixed Effects
Est. S.E. t val. d.f. p
(Intercept) 0.646 0.022 29.092 65.713 0.000
rel_strength 0.080 0.010 7.687 64.305 0.000
p values calculated using Satterthwaite d.f.
# lmer(choose_first ~ strength_first + strength_second + (strength_first + strength_second |wid), data=X) %>% summ

Note: unless otherwise stated, all regression tables are maximal linear mixed effects models (random slopes and intercepts) applied to the human data. Plotted regression lines are fixed effects only.

Fixation proportion by relative memory strength

The simplest test of rational memory: do people spend more time looking at the cue that they have a stronger memory of? This analysis only considers trials where both cues are seen.

df %>% 
    filter(n_pres >= 2) %>% 
    ggplot(aes(rel_strength, prop_first)) + 
    geom_smooth(method = "lm", formula = y ~ x) +
    stat_summary_bin(fun.data=mean_cl_boot, bins=5) + 
    facet_wrap(~name) +
    labs(x="Relative Memory Strength", y="Proportion Fixate First")

X = df %>% filter(n_pres >= 2 & name == "Human")
lmer(prop_first ~ rel_strength + (rel_strength|wid), data=X) %>% summ
Fixed Effects
Est. S.E. t val. d.f. p
(Intercept) 0.416 0.023 17.967 42.445 0.000
rel_strength 0.096 0.013 7.372 77.050 0.000
p values calculated using Satterthwaite d.f.

They do! 🎉

But hold on, what’s going on with the random model…

Last fixation effect

People remember the thing they look at last 94% of the time. In value-based decision making, it has been suggested that this last-fixation effect could explain away what looks to be evidence for adaptive attention allocation. This claim did not hold up there, but it could nevertheless be affecting our results.

df %>% 
    filter(n_pres >= 2) %>% 
    ggplot(aes(rel_strength, prop_first, color=last_pres)) + 
    geom_smooth(method="lm") + 
    stat_summary_bin(fun.data=mean_cl_boot, bins=5) + 
    facet_wrap(~name) +
    theme(legend.position="top") +
    labs(x="Relative Memory Strength", y="Proportion Fixate First Cue", color="Last Fixation")

X = df %>% filter(n_pres > 1 & name == "Human")
lmer(prop_first ~ rel_strength * last_pres + (rel_strength * last_pres | wid), data=X) %>% summ
Fixed Effects
Est. S.E. t val. d.f. p
(Intercept) 0.709 0.012 60.922 48.144 0.000
rel_strength 0.009 0.005 1.680 72.414 0.097
last_pressecond -0.335 0.018 -18.698 46.039 0.000
rel_strength:last_pressecond -0.006 0.007 -0.840 409.405 0.402
p values calculated using Satterthwaite d.f.

Indeed, the effect of relative memory strength on fixation proportion (nearly?) disappears when we control for the effect of the last fixation. Read on for an explanation, but feel free to skip to the next section.

What’s going on here? Basically, the last fixation is correlated with both fixation proportion and relative strength. The first cue is fixated longer on trials where it is seen last (because it gets one more fixation than the other cue).
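A toy simulation makes the confound concrete: if the last fixation drives both relative strength and fixation proportion, the two end up correlated even with no direct link between them. All effect sizes here are made up for illustration:

```r
# Toy simulation: last_first raises both rel_strength and prop_first,
# inducing a correlation between them despite no direct link.
# All effect sizes are invented for illustration.
set.seed(1)
n = 5000
last_first = rbinom(n, 1, 0.5)  # 1 if the last fixation is on the first cue
rel_strength = rnorm(n, mean = ifelse(last_first == 1, 0.3, -0.3))
prop_first = rnorm(n, mean = ifelse(last_first == 1, 0.6, 0.4), sd = 0.05)
cor(rel_strength, prop_first)   # positive, driven entirely by last_first
```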

df %>%
    filter(n_pres < 8) %>% 
    ggplot(aes(n_pres, prop_first, color=last_pres)) +
    facet_wrap(~name) +
    labs(x='Number of Fixations', y='Proportion Fixate First Cue') +
    stat_summary(fun.data=mean_cl_boot)

Furthermore, relative strength is higher for trials where the last fixation is on the first item. This is (presumably) because people tend to remember the cue they look at last and they are also more likely to remember the stronger pair.

df %>% 
    filter(n_pres < 8) %>% 
    ggplot(aes(n_pres, rel_strength, color=last_pres)) +
    facet_wrap(~name) +
    labs(x="Number of Fixations", y="Relative Memory Strength") +
    stat_summary(fun.data=mean_cl_boot)

Putting the last two effects together, we get a correlation between strength and fixation proportion based only on the target of the last fixation.

df %>% 
    ggplot(aes(rel_strength, prop_first, color=last_pres)) +
    # geom_point() + 
    # geom_errorbar(aes(ymin=prop_first_mean - prop_first_sd, ymax=prop_first_mean + prop_first_sd)) +
    # geom_errorbarh(aes(xmin=rel_strength_mean - rel_strength_sd, xmax=rel_strength_mean + rel_strength_sd)) +
    geom_density_2d() +
    facet_wrap(~name) +
    theme(legend.position="top") +
    labs(x="Relative Memory Strength", y="Proportion Fixate First Cue", color="Last Fixation")

It might be easier to see if we group the data by number of fixations:

df %>% 
    filter(between(n_pres, 1, 6)) %>% 
    group_by(name, last_pres, n_pres) %>% 
    summarise(across(c(rel_strength, prop_first), list(mean=mean, se=~ sd(.x) / sqrt(length(.x))))) %>%
    ggplot(aes(rel_strength_mean, prop_first_mean, color=last_pres, label=n_pres)) +
    geom_errorbar(aes(ymin=prop_first_mean - prop_first_se, ymax=prop_first_mean + prop_first_se)) +
    geom_errorbarh(aes(xmin=rel_strength_mean - rel_strength_se, xmax=rel_strength_mean + rel_strength_se)) +
    facet_wrap(~name) +
    theme(legend.position="top") +
    labs(x="Relative Memory Strength", y="Proportion Fixate First Cue", color="Last Fixation") +
    geom_label()

First fixation duration

Given the complications with the total fixation duration, the duration of the first fixation is potentially a more reliable measure. The model makes a strong prediction here, and it probably isn’t confounded by the last-fixation effect.

Consider only the trials with more than one fixation. Do participants switch away from the first image more quickly when they have a worse memory of it?

df %>% 
    filter(n_pres >= 2) %>% 
    ggplot(aes(strength_first, first_pres_time)) +
    geom_smooth(method="lm") + 
    stat_summary_bin(fun.data=mean_cl_boot, bins=5) + 
    facet_wrap(~name) +
    labs(x="First Cue Memory Strength", y="First Fixation Time")

human %>% 
    filter(n_pres >= 2) %>% 
    lmer(first_pres_time ~ strength_first + (strength_first|wid), data=.) %>% summ
Fixed Effects
Est. S.E. t val. d.f. p
(Intercept) 1309.409 89.287 14.665 44.387 0.000
strength_first 37.163 24.354 1.526 28.851 0.138
p values calculated using Satterthwaite d.f.

No sign of an effect here.

Second fixation duration

We can do a similar analysis for the second fixation duration. In this case, however, the duration may depend on the strengths of both the first and second cues. As before, we exclude final fixations in this plot (i.e., excluding trials with fewer than three fixations).

df %>% 
    filter(n_pres >= 3) %>% 
    ggplot(aes(rel_strength, second_pres_time)) +
    geom_smooth(method="lm") + 
    stat_summary_bin(fun.data=mean_cl_boot, bins=5) + 
    facet_wrap(~name) +
    labs(x="Relative Memory Strength", y="Second Fixation Time")

X = human %>% 
     filter(n_pres >= 3) 

X %>% lmer(second_pres_time ~ rel_strength + (rel_strength|wid), data=.) %>% summ
Fixed Effects
Est. S.E. t val. d.f. p
(Intercept) 1116.328 95.010 11.750 43.342 0.000
rel_strength -69.369 32.506 -2.134 30.950 0.041
p values calculated using Satterthwaite d.f.
# X %>% lmer(second_pres_time ~ strength_first + strength_second + (strength_first + strength_second|wid), data=.) %>% summ

Look at that, a significant effect!

Time course

Finally, we can make a plot similar to the initial model prediction. How does the probability of fixating the more-memorable option change over the course of the trial? We normalize the x axis by the duration of the trial to reduce noise (it’s really messy with raw time).

make_fixations = function(df) {
    df %>% 
        filter(n_pres >= 1) %>% 
        ungroup() %>% 
        # bin trials by the absolute strength difference between the two cues
        mutate(
            strength_diff = cut(abs(rel_strength), 
                                quantile(abs(rel_strength), c(0, 0.2, 0.6, 1),  na.rm = T),
                                labels=c("small", "moderate", "large"),
                                ordered=T)
        ) %>% 
        mutate(trial = row_number()) %>% 
        # expand to long format: one row per fixation
        unnest_longer(presentation_times, "duration", indices_to="presentation") %>% 
        mutate(
            fix_first = presentation %% 2,  # odd presentations show the first cue
            fix_stronger = as.numeric(fix_first == (rel_strength > 0)),
        )
}

normalized_timestep = function(long) {
    long %>% 
        group_by(trial) %>%
        mutate(prop_duration = duration / sum(duration)) %>%  # fixation duration as a fraction of the trial
        ungroup() %>% 
        mutate(n_step=round(prop_duration * 100)) %>% 
        uncount(n_step) %>%                                   # one row per 1% of trial time
        group_by(trial) %>% 
        mutate(normalized_timestep = row_number())
}

long = make_fixations(df)
nts = normalized_timestep(long)
nts %>% 
    drop_na(strength_diff) %>% 
    ggplot(aes(normalized_timestep/100, fix_stronger, group = strength_diff, color=strength_diff)) +
    geom_smooth(se=F) + 
    ylim(0, 1) +
    facet_wrap(~name) +
    labs(x="Normalized Time", y="Probability Fixate Stronger Cue", color="Strength Difference") +
    geom_hline(yintercept=0.5) +
    theme(legend.position="top")